High Performance Computing Systems with Various Checkpointing Schemes

Related resources

A New High Performance Checkpointing Approach for Mobile Computing Systems

In this paper, we present a single phase non-blocking coordinated checkpointing algorithm suitable for mobile computing environments. The distinct advantages that make the proposed algorithm suitable for distributed mobile computing systems are the following. It produces a consistent set of checkpoints, without the overhead of taking temporary checkpoints; the algorithm makes sure that only min...

A Survey and Performance Analysis of Checkpointing and Recovery Schemes for Mobile Computing Systems

Ruchi Tuli (Yanbu University College, Royal Commission for Jubail and Yanbu, Directorate General for Yanbu, P.O. Box 30436, Madinat Yanbu Al Sinaiyah, Kingdom of Saudi Arabia) and Parveen Kumar (Merrut Institute of Engineering and Technology, Merrut, India) ...

Distributed Computing Systems and Checkpointing

This paper examines the performance of synchronous checkpointing in a distributed computing environment with and without load redistribution. Performance models are developed, and optimum checkpoint intervals are determined. The analysis extends earlier work by allowing for multiple nodes, state-dependent checkpoint intervals, and a performance metric which is coupled with failure-free performan...

Adaptive Two-Level Blocking Coordinated Checkpointing for High Performance Cluster Computing Systems

Blocking coordinated checkpointing is a well-known method for achieving fault tolerance in cluster computing systems. In this work, we introduce a new approach to blocking coordinated checkpointing based on two-level checkpointing. The first level is local checkpointing, in which computing nodes save their checkpoints to local disk. If a transient failure occurs in the computing node, t...

Analysis of Checkpointing Schemes for Multiprocessor Systems

Parallel computing systems provide hardware redundancy that helps to achieve low-cost fault tolerance, by duplicating the task into more than a single processor, and comparing the states of the processors at checkpoints. This paper suggests a novel technique, based on a Markov Reward Model (MRM), for analyzing the performance of checkpointing schemes with task duplication. We show how thi...

Journal

Journal title: International Journal of Computers Communications & Control

Year: 2009

ISSN: 1841-9836

DOI: 10.15837/ijccc.2009.4.2455